In the last lesson we learned why we use differential privacy to mitigate privacy attacks. In this lesson we'll dig deeper into the Laplace Mechanism with the goal of understanding how we can tune our random noise in such a way that individuals are protected and privacy is preserved.
Let's return to the Laplace distribution:
While this captured the shape of the distribution, it is not the only Laplace distribution. The width of the distribution can be expanded or contracted using the parameter ε. For the distribution above, we used ε=4. Use the slider below to visualize how the distribution changes as we change ε.
As you can see, raising epsilon shrinks the width of the distribution, which increases the chance that a random draw lands close to 0. In privacy terms, a higher epsilon means less noise and therefore less privacy, while a lower epsilon means more privacy but less accurate results. The tradeoff between privacy and utility is a tricky one.
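To make the effect of epsilon concrete, here is a minimal Python sketch. It assumes the noise is drawn from a Laplace distribution centered at 0 with scale 1/ε, which is a common parameterization but is an assumption here, since this lesson has not yet stated the exact formula. The function names `laplace_pdf` and `prob_within` are made up for illustration.

```python
import math

def laplace_pdf(x, epsilon, center=0.0):
    """Density of a Laplace distribution with scale b = 1/epsilon (assumed)."""
    b = 1.0 / epsilon
    return math.exp(-abs(x - center) / b) / (2.0 * b)

def prob_within(radius, epsilon):
    """Probability that a noise draw lands within `radius` of 0."""
    # For a Laplace distribution with scale b: P(|X| <= r) = 1 - exp(-r / b)
    return 1.0 - math.exp(-radius * epsilon)

# Larger epsilon -> taller, narrower peak -> draws cluster near 0.
for eps in (0.1, 1.0, 4.0, 10.0):
    print(f"epsilon={eps:>4}: peak height={laplace_pdf(0.0, eps):.3f}, "
          f"P(|noise| <= 0.5)={prob_within(0.5, eps):.3f}")
```

Running the loop shows the same trend as the slider: as epsilon grows, the peak gets taller and a larger share of draws falls close to 0.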
Let's put this knowledge in the context of our balding bears example. In the last example we drew a histogram of the results of 1000 random number draws and compared the values for the dataset without Oski to the dataset with Oski. Here's the graph to refresh your memory:
This time, let's overlay the actual Laplace distributions to see how epsilon would have affected our results.
This graph looks just like the bar graph! In fact, if you slide epsilon to 4 you will get the exact distributions that generated our histogram. As you keep sliding epsilon past 10, you'll notice that the overlap between the two curves shrinks. A single draw is then much more likely under one curve than the other, so Oski's presence becomes easier to identify. As you slide epsilon down to 0.1, the two distributions overlap more and more. This is bad news for the reporter trying to out Oski, because any single draw is so noisy that it says almost nothing about which dataset it came from.
Choosing the correct epsilon for your specific needs is a difficult task. The correct balance between privacy and accuracy is different for every dataset. Our tools are designed to take some of the guesswork out of the choice.
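The shrinking overlap can also be estimated numerically. The sketch below assumes, for illustration only, that the true counts with and without Oski differ by exactly 1 and that the noise has scale 1/ε; the counts `0.0` and `1.0` are placeholders rather than the values from the bears dataset, and `overlap_area` is a made-up helper name.

```python
import math

def laplace_pdf(x, center, epsilon):
    """Laplace density with scale b = 1/epsilon (assumed parameterization)."""
    b = 1.0 / epsilon
    return math.exp(-abs(x - center) / b) / (2.0 * b)

def overlap_area(center_a, center_b, epsilon, lo=-200.0, hi=200.0, steps=200_000):
    """Approximate the area shared by two Laplace curves (midpoint rule)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        # The shared area at x is whichever curve is lower there.
        total += min(laplace_pdf(x, center_a, epsilon),
                     laplace_pdf(x, center_b, epsilon))
    return total * width

# Hypothetical counts: 0.0 without Oski, 1.0 with Oski.
for eps in (0.1, 4.0, 10.0):
    print(f"epsilon={eps:>4}: overlap ~ {overlap_area(0.0, 1.0, eps):.3f}")
```

An overlap near 1 means the two curves are nearly indistinguishable, so a single noisy answer reveals little about Oski. An overlap near 0 means almost any draw points clearly to one dataset or the other.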
Read on to learn about the two other parameters that affect privacy.
Next Lesson: Sensitivity